b'     NATIONAL ARCHIVES\n     OFFICE of INSPECTOR GENERAL\n\n         Date \t        January 5, 2011\n         Reply to\n         Attn of \t     Office of Inspector General (OIG)\n\n         Subject \t     Management Letter No. 11-08, Electronic Records Archive Lacks Ability to Search\n                       Records\' Contents\n\n         To \t          David S. Ferriero, Archivist of the United States (N)\n\n         The Electronic Records Archives (ERA) program is critical to the future of the National\n         Archives and Records Administration (NARA) and the nation. From inception, ERA was\n         envisioned as the primary way our nation\'s electronic records will be preserved and accessed.\n         Thus, ERA is situated to become NARA\'s flagship system in a world dominated by born digital\n         records, and there is no alternative venue to which the public can turn for comprehensive access\n         to these records of our democracy. However, other than for select records, system limitations\n         will not allow users to conduct a content search of the comprehensive inventory of electronic\n         records ERA has and will continue to ingest. Instead, ERA will only allow users to locate\n         records by searching through metadata generated about the records, and not the text of the\n         records themselves. We believe this constraint has not been adequately communicated to NARA\n         stakeholders.\n\n         The inability of all American citizens to fully search the content of records ingested into ERA\n         will have a profound and adverse impact upon this nation. 
We believe this looming deficit in\n         capacity for American citizens and others to access the records of our democracy has not been\n         effectively communicated outside of this agency. Further, the need for resources which could\n         secure the equipment and staff to process and house the tsunami of records heading to NARA in\n         a manner which might facilitate full-text search capability has not been communicated to OMB,\n         or our Congressional oversight committees. This is akin to not calling for reinforcements when\n         our positions are surely about to be overrun. Figuratively speaking, there is no cavalry on the\n         horizon, and NARA opted not to send out an SOS.\n\n\n\nNATIONAL ARCHIVES and RECORDS ADMINISTRATION\n8601 ADELPHI ROAD, ROOM 1300\nCOLLEGE PARK, MD 20740-6001\nwww.archives.gov\n\x0cIn 2005 NARA awarded Lockheed Martin a design contract to build the foundation of ERA, and\nit has been announced the contract will end on September 30, 2011. As of January 5, 2011 the\nactual costs of ERA totaled approximately $430 million.1 This office has taken an active role in\nproviding audit coverage to ERA. In reports and testimony we have identified a troubled\nprogram. ERA has experienced failed deliverables, numerous changes in requirements, cost\noverruns, material key staff turnover, uncertain funding, technological challenges,\nmiscommunication between vital stakeholders, etc. Throughout the saga this office has asked one\nquestion repeatedly:\n\n         At the end of the contract, when the contractor turns in their keys and badges,\n         what exact functionality will ERA provide to the most important stakeholder, the\n         American citizen?\n\nFrom program inception, research papers commissioned by NARA and published by the\nNational Academy of Sciences extolled the benefits of full-text content searching. 
Indeed,\nNARA contractual documents defined that ERA shall be able to search assets based on their\ncontents and be able to perform keyword, exact phrase, proximity and other types of searches.\nSimply speaking, this means that a document such as this very letter would be ingested into\nERA, and anyone now or in the future would be able to locate it by searching the body of this\ntext. This is consistent with the manner in which users navigate through existing publicly\navailable search engines such as Google, Bing, Yahoo, etc. In fact, the commercial search\nengine applied to ERA not only has full-text content search capability, but we are advised it is\nthe actual default option.\n\nHowever, once publicly available, the base ERA system, other than for a select population of\nrecords, will not support full-text search of the contents of individual records.2 Instead, users\nwill generally only be able to search through metadata generated about the records, with limited\nrecords being made full-text searchable only after being identified as high-request records. This\nshortcoming will become ever more significant as billions of records begin to flow into NARA\nin the coming years. Indeed, a current key ERA staff member defined "we expect that the\npercentage of electronic records that are full text searchable will decrease over time, depending\non 1) volume of records coming in and 2) resources available."\n\nCompounding this situation is the fact NARA has deferred the capability for automated\ngeneration of records descriptions beyond the end of the ERA contract. In a computer system\nwhich does not search the content of records, the record descriptions take on additional\nimportance as the only searchable narrative of the record\'s contents (presuming the descriptions\nare made part of the searchable metadata). However, as ERA has now been set up, such\ndescriptions will not be automatically generated by the system, but instead must be manually\ngenerated. 
Considering the massive amount of data expected to be put into the system, such a\nmanual process will invariably create substantial, perhaps insurmountable, bottlenecks. Without\nfull content searching, this potential delay in generating records\' descriptions will degrade\nERA\'s usefulness in providing timely and full access to electronic records now and into the\nfuture.\n\n\n1 According to the Office of Management and Budget\'s Federal IT Dashboard at\nhttp://it.usaspending.gov/?q=content/cost-summary&buscid=799.\n\n2 We realize not all records will have text to search (e.g., photos), and this Management Letter is limited to the\nsignificant portion of anticipated ERA-housed records which will contain text (e.g., e-mails, reports, databases).\n\x0cIt is unclear when, and by whom, the decision not to pursue full text searching of record\ncontents was made. ERA\'s previous director advised the OIG in April of 2010 that this was\na policy decision that would be made sometime around the end of 2010 or the beginning of\n2011. Other senior NARA staff stated there was never a requirement to search the full text of\nrecord contents, and thus this decision was made at the very beginning of the contract. When\nquestioned about the reason for ERA\'s general lack of search capacity, senior NARA\nofficials have had a variety of responses. Some adamantly stressed they believed the ERA\nrequirements never included the capacity to search the full text of record contents. Others\nblamed the technical requirements, combined with reduced funding for the program. Still\nothers claimed they believed researchers would not want full text search capability, or that\nbeing able to completely search record contents is not desirable for an electronic archive.\nMany of the "architects" of ERA have since left the program and Federal service.\n\nWe look forward to your prompt response to these concerns. 
Should you have any questions,\nplease contact me at (301) 837-1532.\n\n\n\nPaul Brachfeld\nInspector General\n\n\x0c'